Blind Signal Separation and Speech Recognition in the Frequency Domain
Abstract
This paper shows that a Blind Signal Separation (BSS) method in the frequency domain (FDBSS) significantly improves both the speaker Signal to Interference Ratio (SIR) and the phoneme recognition score of a continuous-speech, speaker-independent acoustic decoder in an office environment with multiple simultaneous speakers. Specifically, the efficiency of the presented FDBSS method is studied on a TITO (Two Input, Two Output) network. In extensive experiments in an artificially created environment using real-room impulse responses, the mean SIR resulting from Output Decorrelation increased by approximately 8 dB. Furthermore, percentage phoneme recognition improvements of 85% and 116% were measured for the two separated speech signals compared with the mixed signals. It is also shown that the complexity of the FDBSS method is significantly lower than that of time-domain separation: for M-order linear separating filters it is O(M log M), compared with O(M^2) in the time domain.
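The complexity argument in the abstract rests on a standard fact: filtering with a long FIR filter can be done as a product of spectra instead of a sample-by-sample convolution. The sketch below is only an illustration of that fact and of the decibel definition of SIR, not the authors' FDBSS implementation. All function names (`direct_convolve`, `fft_convolve`, `sir_db`) and the test signals are hypothetical, and a plain radix-2 FFT is used to keep the example self-contained.

```python
import cmath
import math


def direct_convolve(x, h):
    """Time-domain FIR filtering: len(x) * len(h) multiplications."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def fft(a, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft_convolve(x, h):
    """Same result as direct_convolve, computed as a product of spectra."""
    n_out = len(x) + len(h) - 1
    n = 1
    while n < n_out:  # zero-pad to a power of two to avoid circular wrap-around
        n *= 2
    X = fft([complex(v) for v in x] + [0j] * (n - len(x)))
    H = fft([complex(v) for v in h] + [0j] * (n - len(h)))
    y = fft([a * b for a, b in zip(X, H)], inverse=True)
    return [v.real / n for v in y[:n_out]]  # divide by n to normalize the inverse


def sir_db(target_power, interference_power):
    """Signal to Interference Ratio in decibels from two power values."""
    return 10.0 * math.log10(target_power / interference_power)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]   # hypothetical input block
    h = [1.0, -1.0, 0.5]       # hypothetical separating filter taps
    print(direct_convolve(x, h))
    print(fft_convolve(x, h))
    print(sir_db(100.0, 1.0))
```

Both convolution routines return the same sequence up to rounding error, but the FFT route needs on the order of n log n operations for a padded length n, while the nested loop grows with the product of the two lengths. On the decibel scale used by `sir_db`, an 8 dB gain corresponds to roughly a 6.3-fold improvement in the target-to-interference power ratio.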
Similar resources
Doctoral Dissertation Blind Source Separation Based on Multistage Independent Component Analysis
A hands-free speech recognition system and a hands-free telecommunication system are essential for realizing an intuitive, unconstrained, and stress-free human-machine interface. In real acoustic environments, however, speech recognition performance and speech recording performance are significantly degraded because we cannot detect the user’s speech with a high signal-to-noise ratio (SNR) ow...
Speech extraction in a car interior using frequency-domain ICA with rapid filter adaptations
This paper describes two new algorithms for blind source separation (BSS) based on frequency-domain independent component analysis (FDICA). One is FDICA with prefiltering by a speech sub-band passing filter to slow down the learning speed in low signal-to-noise ratio (SNR) sub-bands. The other is FDICA with sub-band selection learning to reduce the number of iterations for those sub-bands. The ...
A Novel Frequency Domain Linearly Constrained Minimum Variance Filter for Speech Enhancement
A reliable speech enhancement method is important for speech applications as a pre-processing step to improve their overall performance. In this paper, we propose a novel frequency domain method for single channel speech enhancement. Conventional frequency domain methods usually neglect the correlation between neighboring time-frequency components of the signals. In the proposed method, we take...
Blind separation of multiple speakers in a multipath environment
We relate information theoretic blind learning methods (infomax) and Bussgang blind equalization methods. The multipath extension of blind source separation methods can be seen in the frequency domain using FIR matrix algebra (matrices of finite impulse response filters). Three forms of Bussgang algorithms are given. The blind serial update method of Cardoso and Laheld is related to the infomax obj...
Improving simultaneous speech recognition in real room environments using overdetermined blind source separation
In this paper we present a novel solution to the Overdetermined Blind Speech Separation (OBSS) problem for improving the speech recognition accuracy of N simultaneous speakers in real room environments using M microphones (M > N). The proposed OBSS system uses basic NxN Blind Speech Separation networks that process in parallel all different combinations of the available mixture signals in the freque...
Publication date: 2007